
Computers in Biology and Medicine

Elsevier BV

Preprints posted in the last 30 days, ranked by how well they match Computers in Biology and Medicine's content profile, based on 120 papers previously published here. The average preprint has a 0.15% match score for this journal, so anything above that is already an above-average fit.

1
Bridging Acoustic and Semantic Spaces for Interpretable Voice Scoring via Zero-Shot Semantic Expansion

Hsiao, C.; Cheng, Y.-R.; Yang, C.-Y.; Hsu, F.-S.

2026-06-01 health informatics 10.64898/2026.05.29.26354442 medRxiv
Top 0.1%
18.6%

Subjective auditory-perceptual evaluation and uninterpretable deep learning models limit the clinical assessment of voice disorders. This study proposes a two-phase zero-shot framework to evaluate voice pathology. First, an Audio Spectrogram Transformer is fine-tuned on the Perceptual Voice Quality Database to generate an acoustic latent space. Second, Orthogonal Procrustes analysis maps these acoustic embeddings directly onto the semantic space of a pre-trained Sentence Transformer. The geometric alignment produced continuous semantic axes that outperformed a supervised machine learning baseline in regressing clinician-rated GRBAS (Grade, Roughness, Breathiness, Asthenia, and Strain) severity scales. Furthermore, these axes correlate with traditional acoustic measures, including Harmonics-to-Noise Ratio and local jitter, while remaining robust when applied to aperiodic signals by not requiring fundamental frequency extraction. Most importantly, the model achieved zero-shot semantic expansion, successfully evaluating voices using an untrained, natural clinical vocabulary beyond the GRBAS scale. External validation on the Voice ICarus Database confirmed cross-corpus stability and demonstrated the capacity for zero-shot differential phenotyping of specific etiologies, such as hypokinetic dysphonia and reflux laryngitis. By bridging acoustic and semantic latent spaces, this framework offers an objective, continuous, and transparent metric for evaluating voice quality using voice descriptive vocabulary.

2
A Consensus-Driven Stacking Ensemble Framework for Interpretable Cardiovascular Risk Prediction and Clinical Deployment

Sozol, S. S.; Dev Nath, B. C.; Fahim, F. M. S.; Suzana, N. N.; Mirza, J. F.; Ahmmed, S.; Zohra, F.-T.; Zafr, A. H. A.; Uddin, M. N.; Mondal, M. R. H.; Hoque, A. S. M. L.

2026-05-26 health informatics 10.64898/2026.05.18.26352989 medRxiv
Top 0.1%
12.7%

Machine learning (ML) is being considered to help diagnose cardiovascular diseases (CVD). Still, challenges like inconsistent and limited datasets, limited infrastructure, and global inequalities lead to the need for a reliable and practicable ML solution. This paper presents an ML-driven framework for predicting CVD risk scores and classifying status. Several data preprocessing techniques, including multiple imputation by chained equations (MICE) and outlier removal, are applied. In addition, hyperparameter tuning is performed with the GridSearchCV tuning technique. Moreover, a consensus-driven five-feature selection method is applied to identify optimal predictors. The dataset used in this study contains healthcare records related to future CVD risk scores, comprising 1,529 patient records with 22 features. The optimized stacked ensemble model is applied to the dataset and achieves a cross-validated coefficient of determination value of 98.13% for CVD risk score regression. Comparative evaluation with other ML models confirmed improved accuracy, efficiency, and interpretability. The explainable AI technique SHAP is applied to interpret predictions and highlight key risk factors. Moreover, a deployment-ready web platform with multi-role access has been developed that demonstrates clinical applicability. The proposed framework offers a reliable and interpretable tool for early detection of CVD and personalized risk assessment. In the future, this work can be extended to integrate longitudinal data, medical imaging, and deep learning to improve generalizability and strengthen real-world impact.

3
Enhanced precision of tensor electrocardiography through increased cumulative distribution function resolution: Validation in healthy individuals

TSUKADA, Y. T.; Hirayama, H.; Yodogawa, K.; Murata, H.; Iwasaki, Y.-k.; Fujino, T.; Shiozawa, A.; Tsukada, S.

2026-06-02 cardiovascular medicine 10.64898/2026.05.31.26354561 medRxiv
Top 0.1%
10.1%

Deep-learning ECG analysis is advancing rapidly but lacks stable, physiologically interpretable indicators to anchor explainable artificial intelligence (AI). Tensor cardiography (TCG) models electrocardiographic (ECG) waveforms as differences between pairs of cumulative distribution functions (CDFs), representing collective myocardial action potential transitions. However, the original 4-CDF model has limitations in fitting P waves and complex QRST patterns. This study aimed to evaluate whether increasing the number of CDFs from 4 to 10 improves TCG fitting accuracy and to characterize normative distributions of 10-CDF parameters in healthy individuals. Participants were recruited through occupational health screening at Tobu Railway Co., Ltd. (n = 415) and from the Nippon Medical School Hospital ECG database (n = 29). Standard 12-lead ECGs from 444 healthy participants, including 345 men and 99 women with a mean age of 46.9 years, were analyzed using TCG software. Reconstruction accuracy was assessed using RMSE, paired t-tests, and Cohen's d. The 10-CDF model achieved significantly lower RMSE values across all leads than the 4-CDF model, with all p values < 0.0001 and very large effect sizes. In representative leads, RMSEs for the 4-CDF versus 10-CDF models were 0.0256 versus 0.0061 in lead II, 0.0230 versus 0.0063 in lead V1, and 0.0265 versus 0.0062 in lead V5. The coefficient of determination improved from a median of 0.952 with the 4-CDF model to 0.997 with the 10-CDF model in lead II. Parameter dispersion was reduced, suggesting improved estimation stability. Two new parameters, T_mean_diff and RT_mean_duration, were derivable from the expanded model; RT_mean_duration showed significant correlations with age and body surface area. In conclusion, increasing the CDF resolution from 4 to 10 significantly enhanced ECG waveform reconstruction accuracy and parameter stability. These findings provide normative distributions of 10-CDF TCG parameters and may support future explainable AI-based ECG analysis.
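
The abstract describes TCG as modelling an ECG waveform as differences between pairs of CDFs. The following is a minimal sketch of that idea for a single deflection. It assumes Gaussian CDFs (the abstract does not name the CDF family), and every function and parameter name is hypothetical rather than taken from the TCG software.

```python
import math


def gaussian_cdf(t, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))


def cdf_pair_wave(t, onset_mean, offset_mean, sd, amplitude=1.0):
    """One deflection modelled as the difference of two CDFs.

    The first CDF rises around onset_mean and the second around offset_mean,
    so their difference is a smooth pulse between the two times.
    Parameter names are illustrative assumptions.
    """
    rising = gaussian_cdf(t, onset_mean, sd)
    falling = gaussian_cdf(t, offset_mean, sd)
    return amplitude * (rising - falling)


# Sample one pulse on a 0..1 s grid (amplitude in arbitrary units).
times = [i / 100 for i in range(101)]
wave = [cdf_pair_wave(t, onset_mean=0.40, offset_mean=0.60, sd=0.03) for t in times]
```

In this picture, fitting a waveform means adjusting the means, spreads, and amplitudes of the CDFs until the reconstruction error (for example RMSE, as used in the study) is small.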

4
E-InfertilityTest: An Explainable AI Framework for Male Infertility Assessment

Das, G.; Ghosh, B.; Ghosh, Z.

2026-05-25 bioinformatics 10.64898/2026.05.21.726746 medRxiv
Top 0.1%
8.6%

Male infertility has emerged as a significant concern in modern society, with genetic defects as one of its major underlying causes. This impairment negatively impacts sperm motility and morphology, leading to conditions such as Asthenozoospermia (reduced sperm motility), Teratozoospermia (abnormal sperm morphology) and sometimes Asthenoteratozoospermia (both motility and morphology defects). Assisted reproductive technologies (ART), such as in-vitro fertilization (IVF), offer a potential solution for such cases but with a low success rate. Classical semen analysis provides only a phenotypic snapshot without revealing the fertilizing potential of the sperm. Hence, in order to screen the functional sperm population as well as to get a deeper insight into the reasons underlying the aberrant sperm population, it is important to study their genetic profile. In this work, we have performed a meta-analysis comparing the transcriptomic data of infertile sperm from Asthenozoospermia and Teratozoospermia patients with that of fertile sperm from normal individuals. Thereafter we have screened a signature gene set which has been used to develop a prediction model named Explainable Infertility Test (E-InfertilityTest) to distinguish fertile from infertile sperm at the preliminary level. For each prediction, it will also provide the set of genes that play a dominant role in that prediction. Thus, it will provide a patient-specific dominant gene expression profile responsible for the aberration. This work warrants future validation experiments to substantiate the model's performance in a clinical setting. Users can access the tool named E-InfertilityTest as a standalone version on GitHub. GitHub link: https://github.com/zglabDIB/einfertility.git

5
Vascular Deformation Mapping Calibration with Physics-based Synthetic Data on Multi-axial Aortic Motion

Kim, T.; Baker, T.; Burris, N.; Figueroa, A.

2026-05-22 bioengineering 10.64898/2026.05.20.726669 medRxiv
Top 0.2%
8.3%

Aortic stiffness is both heterogeneous and anisotropic. Current non-invasive methods to estimate aortic stiffness are limited to characterizing the aortic tissue as isotropic due to the lack of techniques required to extract multi-axial strain from 3D dynamic images. Vascular deformation mapping (VDM) is a nonrigid image registration technique which has thus far been applied to map aortic growth using longitudinal imaging. In this study, we propose to use VDM to assess 3D aortic deformation by mapping diastolic and systolic images. During the image registration process, penalty parameters are employed to fine-tune image alignment and penalize non-physiological deformations. These penalty parameters must be calibrated to ensure that VDM successfully reproduces multi-axial aortic motion patterns in health and disease. In this paper, we developed a calibration pipeline for these parameters using synthetic data. A rotation-free shell model was used to generate physics-based synthetic data on aortic motion incorporating patient-specific geometries, root motion, and blood pressure from a cohort of 14 subjects (healthy, Marfan syndrome and thoracic aortic aneurysm). An error metric was defined to quantify the quality of the VDM results. Furthermore, a k-means clustering technique was used to categorize the subjects into three clusters based on ascending aortic motion. Optimal penalty parameters were identified for each of the three clusters. The results indicated that patient clusters with smaller aortic root motion required larger rigidity penalty values. The calibrated parameters successfully reduced errors in 3D displacement and multi-axial stretch compared to un-optimized VDM predictions, enhancing the accuracy of capturing aortic deformation from dynamic images. Among the different aortic regions, the ascending thoracic aorta exhibited the largest error reduction.

6
Exploratory Assessment of Pulsed-Wave Doppler Representations of Lung Sounds Using Deep Learning: An In-Vitro Phantom Study

Saad, A. A.; Murthi, S. B.; Boctor, E. M.; Teeter, W. A.; Seam, N.

2026-06-10 respiratory medicine 10.64898/2026.06.09.26353787 medRxiv
Top 0.2%
6.9%

The increasing availability of portable ultrasound systems motivates exploration of novel approaches to respiratory signal assessment. In this in-vitro study, we investigate whether pulsed-wave (PW) Doppler ultrasound can capture structured spectral patterns from replayed lung sound recordings. Digitized respiratory sounds were replayed through a tissue-mimicking ultrasound phantom, generating 1,478 PW Doppler spectral images from recordings associated with healthy subjects and several externally labeled disease categories. Exploratory classification experiments using a ResNet-18 architecture demonstrated that these Doppler representations contain learnable differences under controlled conditions. These findings motivate further investigation into PW Doppler as a potential representation of respiratory acoustics.

7
ReMind: A Retrospective Self-Report Paradigm for Studying Mind-Wandering Onset During Reading

Sun, H.; Birney, A.; Singh, N.; Olszko, A.; Chen, P.; Ke, J.; Rosenberg, M. D.; Jangraw, D. C.

2026-05-18 bioengineering 10.64898/2026.05.14.725227 medRxiv
Top 0.2%
6.8%

Mind-wandering (MW) is a frequent and pervasive phenomenon, yet it is commonly assessed using self-reports or probe-based methods that offer limited temporal precision regarding its onset. In this study, we introduce a novel paradigm, ReMind, that estimates the onset and duration of MW episodes during natural reading by combining retrospective self-reports with eye-tracking. Participants indicated the words where they believed their mind started and stopped wandering, and these reports were aligned with gaze timestamps to estimate MW onset. Using data from 44 participants, we examined whether knowledge of MW onset improves the detection of MW from eye-tracking signals. To evaluate relevance for both self-report and thought-probe paradigms, we additionally simulated thought probes by randomly sampling time points during reading. Logistic regression classifiers trained on eye-tracking features extracted from time windows anchored to MW onset achieved AUROC scores of 0.659 and 0.621 under the self-report and simulated thought-probe paradigms, respectively, using leave-one-subject-out cross-validation. In both cases, onset-aligned windows outperformed classifiers trained using arbitrary MW windows. Sliding-window analyses further revealed systematic temporal changes around MW onset, with classification performance peaking at approximately 3 seconds after onset. Feature-level analyses showed reduced fixation rate and fixation dispersion, along with increased pupil size following MW onset. Together, these findings characterize the temporal progression from on-task reading to MW. Overall, ReMind provides a useful framework for studying the temporal dynamics of MW during naturalistic reading.
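
The classifiers above were evaluated with leave-one-subject-out cross-validation. As a rough illustration of what that split looks like, here is a minimal sketch in plain Python. The participant IDs and the helper name `leave_one_subject_out` are hypothetical and not from the study.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids[i] is the participant who produced sample i, so every window
    from one participant lands in the same fold and never leaks into training.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Toy example: six time windows from three hypothetical participants.
subjects = ["p01", "p01", "p02", "p02", "p02", "p03"]
folds = list(leave_one_subject_out(subjects))
```

With 44 participants this scheme would produce 44 folds, each testing on one person's data after training on the other 43.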

8
A transformer model explaining mechanisms of drug therapeutic and adverse effects

Ke, J.; Melamed, R. D.

2026-05-13 genetic and genomic medicine 10.64898/2026.05.11.26352917 medRxiv
Top 0.3%
6.7%

Understanding which disease genes are altered by a drug can provide insight into the biology of effect, help us understand adverse drug effects, and suggest new drug uses. Here, we build on our model Draphnet in a new formulation with a similar goal. Draphnet was designed to explain drug therapeutic and side effects by learning a network connecting drugs to the disease genes they alter. Our new model, DraPhormer, has a similar goal but, instead of relying on a linear model, learns drug-to-gene connections with a transformer model. DraPhormer integrates drug molecular data, disease genetics, and known drug effects on diseases, along with language models representing all of these entities. We show in simulations that DraPhormer can explain the genetic mechanisms of drug effects. Then, we present our design for incorporating drug and disease biology into the model. Finally, we benchmark the model's ability to learn drug indications and side effects in real data.

9
Next-Generation Skin Cancer Detection Using Efficient Fuzzy Fusion of Genomic and Imaging Data

Molla, A. R.; Maity, A.; Saha, S.; Bhattacharya, R.; Chakraborty, A.; Biswas, S.; Nath, S.

2026-06-08 health informatics 10.64898/2026.06.05.26355024 medRxiv
Top 0.3%
6.5%

Skin cancer requires early detection for improved survival rates. Most existing methods rely on deep learning based image classification, which is affected by visual similarity among lesions. Fewer studies use Gene Expression (GE) analysis, which captures molecular characteristics but lacks structural and visual details. To overcome limitations of individual modalities, this paper proposes a multimodal framework integrating dermoscopic images and GE profiles for skin cancer classification. EfficientNet and logistic regression are used for image based analysis and genomic skin lesion profiling, respectively, followed by fuzzy rule based decision systems to reduce uncertainty within individual modalities. Finally, fuzzy fusion combines predictions from both modalities using uncertainty based weighting of classifier outputs. The experimental findings show that both the image based and GE based classification models individually achieved accuracies of nearly 92%. However, the integration of prediction results through the proposed fuzzy fusion strategy further enhanced the classification performance, achieving an overall accuracy of 94.25%. The results obtained outperform contemporary methods, highlighting the effectiveness of combining complementary multimodal information compared with single modality approaches.
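
The abstract mentions combining the two modalities through uncertainty-based weighting of classifier outputs, without giving the rule. The sketch below swaps in a simple entropy-based weight to show the general shape of such a fusion. It is not the paper's fuzzy rule system, and the function names and the weighting formula are assumptions.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy scaled to [0, 1]: 0 means certain, 1 means uniform."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


def fuse(image_probs, gene_probs):
    """Weight each modality by its confidence, taken as 1 - normalized entropy."""
    w_img = max(0.0, 1.0 - normalized_entropy(image_probs))
    w_gene = max(0.0, 1.0 - normalized_entropy(gene_probs))
    total = w_img + w_gene
    if total <= 1e-12:
        # Both modalities are maximally uncertain: fall back to equal weights.
        w_img, w_gene, total = 0.5, 0.5, 1.0
    return [(w_img * a + w_gene * b) / total
            for a, b in zip(image_probs, gene_probs)]


# A confident image model and an undecided gene-expression model.
fused = fuse([0.9, 0.1], [0.5, 0.5])
```

Here the undecided modality receives almost no weight, so the fused output stays close to the confident one.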

10
Automated identification of bolus types in modified barium swallow studies using deep learning: a preliminary study

Mao, S.; Sahli, A. J.; Buoy, S. N.; Hutcheson, C.; Gelabert, G. A.; Barbon, C. E. A.; Naser, M. A.; Fuller, C. D.; Brock, K. K.; Hutcheson, K. A.

2026-05-20 radiology and imaging 10.64898/2026.05.16.26353385 medRxiv
Top 0.3%
6.5%

Purpose: Modified Barium Swallow (MBS) studies utilize videofluoroscopy, a dynamic X-ray technique for evaluating swallowing anatomy and physiology. Each MBS exam typically includes multiple bolus trials, often involving different bolus consistencies. Accurate classification of bolus types is essential, as swallowing dynamics, aspiration risks, and residue levels vary with bolus consistency. In this preliminary study, we propose a deep learning-based approach for automated bolus type classification in MBS, aiming to provide a standardized and efficient framework for automated processing of swallowing assessments. Methods: A total of 206 patients (Mean +/- SD age: 60.24 +/- 9.02 years; 89.32% men) underwent MBS examinations, comprising 277 individual MBS studies. The dataset included 2,752 bolus-level video segments, categorized by bolus type as follows: 1,711 liquid (IDDSI 0-3, 62.17%), 521 pudding (IDDSI 4, 18.93%), and 520 solid boluses (IDDSI 7, cookie or cracker, 18.89%). To standardize variable video lengths for the data pipeline, each MBS video was temporally segmented into a fixed-length frame sequence, with shorter videos padded using static frames and longer videos randomly cropped to the target length. We employed an Inflated 3D convolutional neural network to develop the deep learning model. Results: Each video segment contained an average of 273.03 +/- 195.81 frames. On the independent test set, the deep learning model achieved an overall accuracy of 96.13%, and the macro F1-score was 95.05% in classifying food bolus types within MBS videos. Conclusions: The developed AI-based system demonstrated effective automated classification of food bolus types in MBS videos, representing an important step toward fully automated MBS analysis for swallowing efficiency assessment. The AI model reduces the reliance on manual labels, thereby promising to streamline clinical and research workflows.
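
The Methods describe bringing variable-length clips to a fixed frame count by padding short videos with static frames and randomly cropping long ones. A minimal sketch of that step follows. It assumes "static frames" means repeating the last frame and that crops are contiguous. Both are assumptions, and the helper name `to_fixed_length` is hypothetical.

```python
import random


def to_fixed_length(frames, target_len, rng=random):
    """Return exactly target_len frames from a list of frames.

    Longer clips get a random contiguous crop. Shorter clips are padded by
    repeating the last frame (one reading of "static frames", an assumption).
    """
    n = len(frames)
    if n == 0:
        raise ValueError("clip has no frames")
    if n >= target_len:
        start = rng.randrange(n - target_len + 1)
        return frames[start:start + target_len]
    return frames + [frames[-1]] * (target_len - n)


# Integers stand in for image frames in this toy example.
short_clip = to_fixed_length(list(range(5)), target_len=8)
long_clip = to_fixed_length(list(range(20)), target_len=8, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the crop reproducible, which helps when debugging a data pipeline.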

11
Multi-Agent AI for Chest Radiography: A Sequential Segmentation and LLM-Driven Consultative Tool for Medical Training

Kurt, F.; Subasi, A.

2026-06-01 health informatics 10.64898/2026.05.29.26354432 medRxiv
Top 0.4%
6.2%

Background: Traditional diagnostic models lack explainability, while multimodal language models prone to hallucination remain unsafe for medical education. An interactive, risk-free artificial intelligence framework is required to serve as a reliable clinical mentor for radiology trainees. Methods: We propose a multi-agent architecture decoupling deterministic image analysis from generative consultation. Specialized computer vision models perform anatomical localization and pathological segmentation. These quantitative outputs are synthesized into a structured payload, which grounds a locally hosted large language model (LLaVA 7B) using strict prompt guardrails and prerequisite protocols. Results: The system effectively eliminates visual hallucinations by intercepting unanchored queries. The artificial intelligence tutor successfully contextualizes spatial anomalies and baseline metrics, generating accurate conversational explanations and formally structured radiology reports while strictly enforcing medical safety disclaimers. Discussion and Conclusion: By anchoring language generation exclusively to verified algorithmic realities, this framework transforms opaque diagnostic models into safe, interactive educational simulators. This establishes a highly reliable paradigm for integrating explainable artificial intelligence into medical training.

12
A Supervised Learning Framework for Stroke Hospitalization Factors Selection Using the Lasso-MIDAS Model

Li, Q.; Wang, L.

2026-05-20 cardiovascular medicine 10.64898/2026.05.15.26353365 medRxiv
Top 0.4%
5.1%

Stroke, as an acute cerebrovascular disease with significant public health implications, is influenced by a complex interplay of meteorological conditions, air quality, and socioeconomic factors. However, the inherent challenges of mixed-frequency data from diverse sources and high-dimensional variable spaces limit the effectiveness of traditional regression models. This study develops a Lasso-MIDAS model framework to identify the key multidimensional drivers of stroke admissions. Using this approach, 21 candidate variables encompassing meteorological, environmental, and economic indicators were screened. The empirical results identified 11 core influencing factors. In the meteorological and environmental dimensions, Wind Speed, Carbon Monoxide (CO), and Sulfur Dioxide (SO2) were identified as significant positive drivers, with Temperature Difference also positively correlating with admission risks. Conversely, Nitrogen Dioxide (NO2) exhibited a negative correlation, potentially reflecting behavioral adaptation and exposure reduction during peak pollution periods. In the socioeconomic dimension, the Consumer Price Index (CPI) for Food, Tobacco, and Alcohol emerged as a major risk factor, highlighting the impact of living cost pressures on public health. The findings demonstrate the superiority of the Lasso-MIDAS model in handling large-scale healthcare data. It effectively addresses the frequency mismatch problem while enhancing the robustness of causal identification through variable shrinkage. These conclusions provide a scientific basis for health authorities to establish early warning systems and optimize public health policy interventions.

13
Weight-Guided Constraints for Body Model and Lead Selection in Pediatric CIED MRI Safety Simulations

Hameed, S.; Henry, K.; Jiang, F.; Bhusal, B.; Dillenbeck, H.; Gakenheimer-Smith, L.; Webster, G.; Golestani Rad, L.

2026-05-30 radiology and imaging 10.64898/2026.05.26.26354162 medRxiv
Top 0.4%
4.9%

Pediatric patients with cardiac implantable electronic devices (CIEDs) face limited MRI access due to RF-induced heating, and computational modeling is increasingly used to characterize this risk. The validity of these simulations, however, depends on pairing body models with clinically realistic lead configurations, guidance that is currently lacking. We retrospectively analyzed 302 CIED surgeries in 281 pediatric patients to derive weight-based constraints for simulation design. Weight alone discriminated epicardial from endocardial lead implantation with AUC = 0.90, and adding age and height yielded no improvement, supporting weight as a sufficient single-parameter selection metric. The probabilistic crossover between approaches occurred at 44 kg, substantially higher than the 10 to 15 kg threshold commonly cited in the literature, with a broad transition zone of 21 to 66 kg in which both lead types were routinely used. Lead length was likewise weight-constrained: only 25 cm leads were observed in patients below 6 kg, and leads of 45 cm or longer were uncommon below 50 kg. These findings yield a three-tier framework, with epicardial-only configurations below 21 kg, dual configurations within 21 to 66 kg, and weight-thresholded lead lengths throughout, enabling MRI safety simulations to focus on clinically realizable anatomy and device combinations.

14
An Interpretable Multimodal Framework for Student Mental Health Risk Assessment Using Temporal Embeddings and Fuzzy Inference

Shah, A.; Mehta, A.; Bhensdadia, C. K.

2026-05-20 health informatics 10.64898/2026.05.16.26352630 medRxiv
Top 0.5%
4.3%

Mental health challenges among university students have increased due to academic pressure, lifestyle changes, and continuous digital engagement. Existing approaches for mental health assessment often rely either on self-reported psychological scales or isolated behavioral indicators, limiting their ability to capture complex temporal and contextual patterns. This study proposes an interpretable multimodal framework for student mental health risk assessment using behavioral sensing, academic information, ecological momentary assessments (EMA), and psychometric survey data. A bidirectional Long Short-Term Memory autoencoder is employed to learn latent temporal representations from day-level behavioral sequences, while graph embeddings capture structural relationships among students using similarity-based neighborhood graphs. These representations are fused with academic and survey-derived features and reduced using Principal Component Analysis and Uniform Manifold Approximation and Projection. K-means clustering is then applied to identify behaviorally distinct student groups. Experimental analysis on the StudentLife dataset demonstrates meaningful clustering performance with a Silhouette Score of 0.4209 and Adjusted Rand Index stability of 0.6869. The identified clusters correspond to low-risk, moderate-risk, and high-risk behavioral profiles. To improve interpretability and practical usability, a fuzzy inference system is introduced to compute mental risk, academic risk, and wellbeing indices using psychometric indicators including PHQ-9, PSS, PANAS, VR-12, and Big Five personality traits. The results demonstrate the potential of combining multimodal behavioral modeling with interpretable fuzzy reasoning to support early mental health risk assessment in educational settings.

15
A direct forcing immersed boundary method for biofluid simulations using a non-linear rotation free shell model on unstructured grids

Kim, T.; Malipeddi, A. R.; Capecelatro, J.; Figueroa, A.

2026-05-19 bioengineering 10.64898/2026.05.16.725689 medRxiv
Top 0.6%
4.3%

Thin structures such as heart valves and aortic dissection flaps interact dynamically with blood flow in human vessels. Their flexibility and capacity for large deformations generate complex, highly transient hemodynamic patterns over the cardiac cycle. Accurately resolving these interactions remains challenging for conventional boundary-fitted fluid-structure interaction approaches. We present an immersed boundary method for simulating thin structures in incompressible flow on unstructured grids. The method couples a stabilized finite element fluid solver with a nonlinear, rotation-free shell formulation through a direct forcing immersed boundary approach. The framework supports both weak (explicit) and strong (implicit) time-coupling strategies, enabling stable simulations over a wide range of solid-to-fluid density ratios. Hydrodynamic forces acting on thin structures are computed from fluid solutions sampled on both sides of the structure, allowing accurate force reconstruction for zero-thickness shells. To our knowledge, this is the first immersed boundary formulation that couples an unstructured finite element fluid solver with a two-dimensional, rotation-free shell model to simulate interactions between thin structures and incompressible flow. Fluid-structure coupling is achieved using predefined finite element shape functions, which provide consistent projection between Eulerian and Lagrangian fields without additional interpolation procedures. The framework is validated using three-dimensional benchmark problems involving thin structures. Then, a valve-like model is used to compare strong and weak coupling strategies. Finally, the method is applied to an idealized type-B aortic dissection model. The proposed approach is implemented within the open-source software CRIMSON, a finite element platform for cardiovascular simulation.

16
MASHA: A Multi-Agent System for Healthcare Sentiment Analysis Using AI for Migraine Detection in Arabic Tweets

Baroud, S.

2026-05-22 health informatics 10.64898/2026.05.21.26352626 medRxiv
Top 0.6%
4.3%

Migraine detection and sentiment analysis in healthcare have become increasingly important, particularly with the rise of social media platforms like Twitter, where users often share their personal health experiences. This study presents MASHA (Multi-Agent System for Healthcare Sentiment Analysis), an artificial intelligence (AI)-driven framework that integrates multiple machine learning (ML) models for sentiment analysis of Arabic tweets related to migraines. The system leverages a multi-agent architecture to handle tasks such as data acquisition, pre-processing, model training and real-time decision-making. Key ML models, including Support Vector Machines (SVM), Naive Bayes (NB) and Logistic Regression (LR), are integrated using ensemble techniques, leading to improved classification performance. Experiments conducted on a dataset of Arabic tweets demonstrate that MASHA outperforms traditional methods, achieving an accuracy of 90.0% and an F1-score of 89.46%. Moreover, the system's scalability and flexibility make it suitable for real-time public health monitoring, offering valuable insights into patient experiences and public sentiment regarding healthcare services. MASHA's adaptability suggests its potential application for analysing other healthcare-related conditions, reinforcing the system's scalability and broader relevance. Future work will focus on incorporating deep learning (DL) models and expanding the dataset with content from additional social media platforms.

17
AutoClip: AI-Guided TEE Semantic Segmentation for TEER A Proof-of-Concept Study

Chen, M.; Li, X.; Yang, K.; Taramasso, M.

2026-06-06 cardiovascular medicine 10.64898/2026.05.29.26354195 medRxiv
Top 0.6%
4.2%

**Abstract** **Background:** Transcatheter edge-to-edge repair (TEER) is an established treatment for mitral regurgitation but remains highly dependent on operator experience and complex transesophageal echocardiography (TEE)-guided intraprocedural imaging. Artificial intelligence (AI)-based semantic segmentation may improve procedural reproducibility and intraprocedural guidance; however, no TEER-specific segmentation framework has been reported. **Objectives:** To develop and evaluate AutoClip, a clinician-driven AI-guided TEE semantic segmentation model designed for simultaneous delineation of mitral valve anatomy and in-vivo TEER device components. **Methods:** A retrospective proof-of-concept study was conducted using 987 intraprocedural TEE frames derived from 10 video clips in 3 patients undergoing MitraClip G4 implantation. Seven semantic labels, including mitral leaflets and device components, were manually annotated using ITK-SNAP. Following standardized preprocessing and region-of-interest extraction, an Attention U-Net architecture was trained frame-wise on bicommissural and corresponding X-plane TEE views. Model performance was assessed using mean intersection-over-union (IoU) and Dice coefficient on an independent test set. **Results:** The Attention U-Net demonstrated improved sensitivity to small device structures compared with conventional U-Net architectures. Preliminary training performance achieved a mean IoU of approximately 0.93, while independent test performance reached a mean IoU of 0.46 across foreground classes. Qualitative assessment demonstrated feasible simultaneous segmentation of mitral leaflets, clip arms, grippers, and delivery shaft during TEER procedures. **Conclusions:** AutoClip represents a proof-of-concept TEER-specific TEE semantic segmentation framework initiated through a clinician-oriented workflow without formal computer science expertise. 
Although preliminary accuracy remains modest due to limited sample size, this study establishes a reproducible pathway for future AI-assisted intraprocedural guidance systems and larger multicenter development efforts in structural heart interventions.
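
The two metrics named in this abstract, intersection-over-union (IoU) and the Dice coefficient, can be illustrated with a minimal sketch. It is not the AutoClip evaluation code: the function names, the flattened label-map representation, and the toy labels are assumptions made for illustration, as is the convention of scoring a class that is absent from both masks as a perfect match.

```python
from typing import Sequence, Tuple


def iou_and_dice(pred: Sequence[int], truth: Sequence[int], label: int) -> Tuple[float, float]:
    """Return (IoU, Dice) for one class label over flattened label maps."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of pixels")
    # Pixels where both masks carry the label.
    intersection = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    pred_count = sum(1 for p in pred if p == label)
    truth_count = sum(1 for t in truth if t == label)
    union = pred_count + truth_count - intersection
    if union == 0:
        # Assumed convention: a class absent from both masks counts as a perfect match.
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_count + truth_count)
    return iou, dice


def mean_foreground_iou(pred: Sequence[int], truth: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class IoU over the given foreground labels (background excluded)."""
    scores = [iou_and_dice(pred, truth, label)[0] for label in labels]
    return sum(scores) / len(scores)


# Hypothetical toy masks: 0 = background, 1 and 2 = two foreground classes.
truth_mask = [0, 1, 1, 2, 2, 2]
pred_mask = [0, 1, 2, 2, 2, 0]
toy_mean_iou = mean_foreground_iou(pred_mask, truth_mask, labels=[1, 2])
```

Averaging over foreground labels only, as here, keeps a large and easily segmented background from inflating the score.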

18
Genome-wide computational prediction of miRNAs encoded by influenza A virus (H3N2) predicts target genes involved in pulmonary and antiviral innate immunity

Siddiqi, M. A.; Kumar, H.; Mazumder, M.

2026-05-18 bioinformatics 10.64898/2026.05.18.725090 medRxiv
Top 0.6%
4.0%

Influenza A virus (IAV) causes significant morbidity and mortality worldwide. Understanding how viral RNAs may regulate host genes through microRNA-like mechanisms can clarify pathogenesis and reveal therapeutic targets. In this study, we screened all eight IAV H3N2 RNA segments (PB2, PB1, PA, HA, NP, NA, M, and NS) using an ab initio computational pipeline; five segments (PB2, PB1, PA, HA, and M) met the VMir scoring threshold for further analysis, while NP, NA, and NS were excluded due to low pre-miRNA scores. Mature miRNAs were identified using MatureBayes, and target genes in the human genome were predicted with the miRDB server. From these targets, we selected two genes per qualifying segment (10 genes total) based on their functional relevance to influenza infection and supporting literature; all selected genes are unique to their respective segment. We identified 10 segment-specific target genes (IFNL1, DDX60, SAMHD1, MAVS, IRF4, BIRC2, AGO1, MAP3K1, NOD1, and TNFAIP1) and one common target across all five analyzed segments (CADM2). Gene Ontology and pathway analyses showed enrichment in interferon signaling, RIG-I-like receptor pathways, antiviral restriction, RNA interference, and inflammatory responses. Literature supports roles for these genes in pulmonary and antiviral innate immunity. Our findings provide a basis for experimental validation and may help the research community better understand influenza virus pathogenesis and identify novel therapeutic candidates.

19
Hybrid Neural-Bayesian Belief Network Framework for Uncertainty-Aware Multimodal GBM Prediction

Jayme, A.; Heuveline, V.

2026-05-13 health informatics 10.64898/2026.05.10.26352710 medRxiv
Top 0.7%
3.9%

Background and Objective: Glioblastoma outcome prediction remains difficult because clinically relevant signals are distributed across heterogeneous imaging and genomic modalities, cohorts are small, and conventional neural predictors do not quantify their own uncertainty. This study evaluates a hybrid neural-Bayesian belief network framework for uncertainty-aware multimodal glioblastoma prediction and examines how modality selection, model family, and structure-aware regularization affect predictive performance and confidence quality. Methods: The framework was evaluated on the TCGA-GBM radiogenomic cohort using four input modalities (T1Gd, FLAIR, mRNA, and CNA), five model families, five structural-weight settings, and 15 view subsets. A secondary benchmark on the UCI Human Activity Recognition dataset was included to assess whether observed limitations were specific to the glioblastoma setting. Results: CNA features consistently reduced performance in most multimodal settings, and selective fusion excluding CNA outperformed both the full four-view baseline and imaging-only alternatives. Model families showed clear differences in uncertainty behaviour: non-Bayesian families achieved the strongest predictive accuracy, whereas the Bayesian family achieved the lowest calibration error over a narrower confidence range. Bayesian belief network regularization produced consistent directional improvements without supporting reliable structure-discovery claims, as learned graph structures were not reproducible across folds. On the secondary benchmark, the same framework achieved much higher predictive performance, indicating that the glioblastoma performance ceiling primarily reflects data limitations rather than an architectural constraint. Conclusions: In small-sample radiogenomic prediction, modality choice is at least as important as model choice, and uncertainty quality differs substantially across uncertainty-aware model families. 
The proposed framework provides a practical basis for comparing accuracy, calibration, modality selection, and structure-aware regularization in multimodal biomedical prediction.
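
The abstract compares model families by calibration error without stating which estimator was used. One common choice is the expected calibration error (ECE) with equal-width confidence bins, sketched below. Both the choice of ECE and every name and value in the sketch are assumptions for illustration, not the authors' implementation.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("confidences and correct must be non-empty and of equal length")
    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hit_sum = [0] * n_bins     # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        conf_sum[b] += conf
        hit_sum[b] += int(hit)
        count[b] += 1
    total = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        gap = abs(hit_sum[b] / count[b] - conf_sum[b] / count[b])
        ece += count[b] / total * gap
    return ece


# Hypothetical toy predictions: two confident hits, one moderate hit, one moderate miss.
toy_ece = expected_calibration_error([0.9, 0.9, 0.6, 0.6], [True, True, True, False])
```

Because empty bins are skipped, a model whose confidences cover only a narrow range is scored on fewer bins, which is worth keeping in mind when comparing calibration across model families.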

20
A multi-modal phase plane method for constructing multivariate disease trajectories.

Cox, T.; Shishegar, R.; Bourgeat, P.; Cespedes, M.; Dore, V.; Doecke, J. D.; Fripp, J. D.; Rowe, C. C.; Masters, C. L.; Villemagne, V. L. C.; Burnham, S.

2026-05-17 health informatics 10.64898/2026.05.13.26353085 medRxiv
Top 0.7%
3.9%

Understanding the sequential order and timing of different biomarkers in the progression of Alzheimer's disease (AD) is paramount for understanding the pathophysiology of the disease. It leads to better staging and improved prediction of clinical progression, and provides crucial knowledge for the design and timing of effective clinical therapeutic trials. This study developed and evaluated a multi-modal phase plane (MMPP) method to construct long-term multivariate disease trajectory curves from short-term longitudinal data for neurodegenerative diseases such as AD. The MMPP method extends a previously presented four-step method for constructing single-variable disease trajectories. A novel anchoring step, which uses study participants' multivariate data to infer the staging of the separate single-variable progression trajectories, allows multivariate disease trajectory curves to be generated. Further, the anchoring step provides disease staging at the individual level. A bootstrapping protocol was employed to provide confidence limits on the predictions. We demonstrate that the MMPP method accurately reconstructs multivariate disease trajectory curves and individuals' disease stage from simulated noisy short-term longitudinal data. Specifically, the method successfully estimated the delay times between distinct progressing variables and reliably predicted individual baseline disease times (r² = 0.981) for participants exhibiting significant early biomarker deviations.
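
The abstract mentions a bootstrapping protocol for confidence limits but gives no details. As a generic illustration only, a percentile bootstrap for a single statistic might look like the sketch below. The function name, the resample count, the fixed seed, and the toy data are all assumptions, and the MMPP protocol itself may differ substantially.

```python
import random
from typing import Callable, List, Sequence, Tuple


def percentile_bootstrap_ci(
    values: Sequence[float],
    statistic: Callable[[List[float]], float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap: resample with replacement, then read off empirical quantiles."""
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return estimates[lower_index], estimates[upper_index]


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


# Hypothetical toy data, e.g. estimated delay times in years.
toy_delays = [2.1, 3.4, 2.8, 3.9, 2.5, 3.1, 2.9, 3.6]
ci_low, ci_high = percentile_bootstrap_ci(toy_delays, mean)
```

The percentile interval is the simplest variant; bias-corrected alternatives exist but are not shown here.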